A method of interstrain differentiation of bacteria



Summary of the invention 


The subject invention lies in the field of interstrain
differentiation of bacteria. A general method has been developed with
which various types of bacteria can be differentiated into separate
individual strains. In the clinical setting in particular, this
method can suitably be used to determine which strain of a bacterium is
present in a sample. The new method is applicable for discerning between
various strains of both Gram negative and Gram positive types of
bacteria.

Background of the invention

Previously we disclosed a method called oligotyping for
interstrain differentiation of Mycobacterium tuberculosis strains in
WO 95/31569. It was stated in that document that one of the key factors
in the control of tuberculosis is the rapid diagnosis of the disease and
the identification of the sources of infection. M. tuberculosis strain
typing has already proved to be extremely useful in outbreak
investigations (6, 14, 31) and is being applied to a variety of
epidemiologic questions in numerous laboratories. Traditionally,
laboratory diagnosis is done by microscopy, culturing of the
micro-organism, skin testing and X-ray imaging. Unfortunately, these
methods are often neither sensitive nor specific, and they are very
time-consuming due to the slow growth rate of M. tuberculosis.
Therefore, new techniques such as in vitro amplification of M.
tuberculosis DNA have been developed to rapidly detect the
micro-organism in clinical specimens (14). The ability to differentiate
isolates of M. tuberculosis by DNA techniques has revolutionized the
potential to identify the sources of infection and to establish main
routes of transmission and risk factors for acquiring tuberculosis by
infection (1, 3-10, 14, 16, 19-22, 25, 26, 27-33). The use of an
effective universal typing system will allow strains from different
geographic areas to be compared and the movement of individual strains
to be tracked. Such data may provide important insights and identify
strains with particular problems such as high infectivity, high
virulence and/or multidrug resistance. Analysis of large numbers of
isolates may provide answers to long-standing questions regarding the
efficacy of BCG vaccination and the frequency of reactivation versus
reinfection.

09/647596
PCT/NL98/00186
WO 99/51771

The same problems identified for M. tuberculosis are inherent
in the differentiation of numerous other bacteria. The problems arise
specifically for potentially epidemic pathogens and for bacteria that
cause infections in hospitals. A more rapid and simple typing method is
required. Preferably the testing methods for the various bacteria are
carried out in the same manner, allowing routine use for all types of
bacteria for which testing is required. Preferably the test can be
carried out by non-specialised personnel using little laboratory space
and time.

The method disclosed in WO 95/31569 is based on the DNA
polymorphism found at a unique chromosomal locus, the "Direct Repeat"
(DR) region, which is uniquely present in M. tuberculosis complex
bacteria. This locus was discovered by Hermans et al. (15) in M. bovis
BCG, the strain used worldwide to vaccinate against tuberculosis. The DR
region in M. bovis BCG consists of directly repeated sequences of 36 base
pairs, which are interspersed by non-repetitive DNA spacers, each 35 to
41 base pairs in length (15). The number of copies of the DR sequence in
M. bovis BCG was determined to be 49. In other strains of the M.
tuberculosis complex the number of DR elements was found to vary (15).
The vast majority of the M. tuberculosis strains contain one or more
IS6110 elements in the DR-containing region of the genome.

It has been shown (12) that the genetic diversity in the DR
region is generated by differences in the DR copy number, suggesting that
homologous recombination between DR sequences may be a major driving
force for the DR-associated DNA polymorphism (12). The high degree of DNA
polymorphism within a relatively small part of the chromosome makes this
region well-suited for a PCR-based fingerprinting technique.

Figure 1 depicts the structure of the DR region of M. bovis BCG
as determined previously by Hermans et al. and Groenen et al. (12, 15).
For the sake of convenience we will designate a DR plus its 3' adjacent
spacer sequence as a "Direct Variant Repeat" (DVR). Thus, the DR region
is composed of a discrete number of DVR's, each consisting of a constant
part (DR) and a variable part (the spacer).
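
The DVR organisation described above can be illustrated with a minimal
sketch. It assumes that the DR copies in a region are exact and that the
region starts with a DR; the function name and the toy sequences are
hypothetical and are not taken from any genome or from the original
disclosure.

```python
def split_into_dvrs(region, dr):
    """Split a DR region into DVRs, returned as (DR, spacer) pairs.

    Assumes exact DR copies and a region that begins with a DR.
    """
    if not region.startswith(dr):
        raise ValueError("region is expected to start with a DR copy")
    # Everything that follows each DR copy is its 3' adjacent spacer.
    spacers = region.split(dr)[1:]
    return [(dr, spacer) for spacer in spacers]


# Hypothetical toy sequences, shortened for readability.
DR = "GTCAGACC"
region = DR + "AAAT" + DR + "CCGGA" + DR + "TTG"
dvrs = split_into_dvrs(region, DR)
for constant_part, variable_part in dvrs:
    print(constant_part, variable_part)
```

In real DR regions the repeats may carry point mutations, so an exact
string split would have to be replaced by a tolerant match.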

The method disclosed in WO 95/31569 is based on a unique method
of in vitro amplification of DNA sequences within the DR region and the
hybridisation of the amplified DNA with multiple, short synthetic
oligomeric DNA sequences based on the sequences of the unique spacer
DNA's within the DR region (figure 2). This differs from previous PCR
methods in the use of a set of primers with both primers having multiple
priming sites, as opposed to having one of the primers bind to a fixed
priming site such as a part of IS6110. Because M. tuberculosis complex
strains differ in the presence of these spacer sequences, strains can be
differentiated by their different hybridisation patterns with a set of
various spacer DNA sequences.

The method consists of in vitro amplification of nucleic acid
using amplification primers in a manner known per se in amplification
reactions such as PCR, LCR or NASBA, wherein a pair of primers is used
comprising oligonucleotide sequences sufficiently complementary to a part
of the Direct Repeat sequence of a microorganism belonging to the M.
tuberculosis complex of microorganisms for hybridisation to a Direct
Repeat to occur and subsequently elongation of the hybridised primer to
take place, said primers being such that elongation in the amplification
reaction occurs for one primer in the 5' direction and for the other
primer in the 3' direction. Due to the multiple presence of Direct
Repeats in the microorganisms to be detected, the use of such primers
implies that all the spacer regions will be amplified in an efficient
manner. In particular it is not necessary for extremely long sequences to
be produced in order to obtain amplification of spacers located at a
distance from the primer. With the instant selection of the primer pairs
a heterogeneous product is obtained comprising fragments all comprising
spacer region nucleic acid. Subsequently the detection of the amplified
product can occur simply by using an oligonucleotide probe directed at
one or more of the spacer regions one wishes to detect. In order to avoid
hindrance in the amplification reactions the primers can have
oligonucleotide sequences complementary to non-overlapping parts of the
Direct Repeat sequence, so that when both primers hybridise to the same
Direct Repeat and undergo elongation they will not be hindered by each
other. In particular, to avoid any hindrance during elongation reactions
when one primer DRa is capable of elongation in the 5' direction and the
other primer DRb is capable of elongation in the 3' direction, DRa is
selected such that it is complementary to a sequence of the Direct Repeat
located to the 5' side of the sequence of the Direct Repeat to which DRb
is complementary. The primers used must have an oligonucleotide sequence
capable of annealing to the consensus sequence of the Direct Repeat in a
manner sufficient for amplification to occur under the circumstances of
the particular amplification reaction. A person skilled in the art of
amplification reactions will have no difficulty in determining which
length and which degree of homology is required for good amplification
reactions to occur. The consensus sequence of the Direct Repeat of
microorganisms belonging to the M. tuberculosis complex is given in
sequence id. no. 2 and in figure 1.
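
The selection of a non-overlapping DRa/DRb primer pair can be sketched as
follows. This is an illustration under stated assumptions only: the
Direct Repeat is written 5' to 3' on one strand, DRa is taken as the
reverse complement of its 5' part and DRb as a copy of its 3' part, and
both primers have the same length. The function names and the example
repeat are hypothetical; the sequence is not the consensus of sequence
id. no. 2.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def outward_primers(dr, length):
    """Derive (DRa, DRb) from non-overlapping parts of one Direct Repeat.

    DRa anneals to the 5' part of the repeat and is elongated towards the
    5' side; DRb corresponds to the 3' part and is elongated towards the
    3' side, so the two primers point away from each other.
    """
    if 2 * length > len(dr):
        raise ValueError("primers would overlap within the Direct Repeat")
    dra = reverse_complement(dr[:length])
    drb = dr[-length:]
    return dra, drb


# Hypothetical 22-nucleotide repeat used purely for illustration.
EXAMPLE_DR = "GATTCCAGTACGGCATTGAAAC"
dra, drb = outward_primers(EXAMPLE_DR, 8)
print(dra, drb)
```

Primer length and annealing behaviour would in practice be chosen by the
skilled person for the amplification reaction at hand; the length of 8
used here only keeps the example short.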

In addition to what has already been disclosed in WO 95/31569 we
also determined the spoligotypes of M. tuberculosis strains which were
subcultured for many months both in the laboratory and in guinea pigs.
The strains selected for this purpose were those used in a previous study
on the stability of IS6110 (2). All subcultured strains displayed
spoligotype patterns identical to those of the primary cultures, thus
indicating that the pace of the molecular clock in this instance is slow
enough for use in epidemiology of the disease.

Because of the large success and simplicity of the method for
Mycobacterium tuberculosis strain differentiation, and in view of the
problems in strain differentiation with other microorganisms, we used the
Direct Repeat consensus sequence to screen data bases with nucleic acid
sequences from other microorganisms. Unfortunately no further matches
were found. The Direct Repeat sequence appeared to be unique for
Mycobacterium tuberculosis, as did its spacer sequences. As to date no
function had actually been attributed to the Direct Repeat sequence, it
was not to be expected that the sequence would be universally distributed
among other types of microorganisms. Such would at best be expected if
the sequence had a function that was also required in other organisms.

Description of the invention 


Notwithstanding the negative result after screening with the
Direct Repeat consensus sequence, we considered further analysis of known
sequences by looking for a pattern in the nucleic acid sequences of other
microorganisms reminiscent of the Direct Repeat-spacer pattern in
Mycobacterium tuberculosis. Quite unexpectedly we found, using a
specifically designed computer programme, that such patterns exist in a
large number of other microorganisms from a broad range of genera. It
appears that the DR-like sequences are very common in prokaryotes. They
are however noticeably absent in eukaryotes. Chapter III of Bergey's
Manual of Determinative Bacteriology, Ninth edition (11) provides a table
of characteristics for distinguishing prokaryotes from eukaryotes, i.e.
for distinguishing bacteria from microscopic eukaryotes in the shape of
molds, yeasts, algae or protozoans.

All bacterial sequences analysed revealed the presence of such
a sequence structure and thus the oligotyping method illustrated for
Mycobacterium tuberculosis can be applied for differentiating between all
strains of bacteria. It was totally unexpected that a consensus structure
of this type could be universally found. The Direct Repeat sequences
themselves differ between genera, but the general framework of a cluster
of Direct Repeat sequences separated by a number of non-repetitive
spacers is universally present in bacterial genomes. This is remarkable
considering the fact that thus far no function has been attributed to
such a region in Mycobacterium tuberculosis, or in fact to any of the
sequences comprising Direct Repeat-like regions in any other bacteria for
which such sequences had been described.

Bacteria can be divided into Archaebacteria and Eubacteria. The
Eubacteria in turn can be divided into Gram-negative and Gram-positive
bacteria with cell walls and Eubacteria lacking cell walls. Chapter IV of
Bergey's Manual of Determinative Bacteriology, Ninth edition (11) reveals
the characteristics for each group. Over a wide range of the subgroups in
these four groups we have found the presence of the consensus structure,
i.e. the presence of DR-like loci. The four groups have been subdivided
by Bergey into more than 30 subgroups. We have examples in Groups 3, 4, 5
and 6 and in Groups 11, 17, 31, 32 and 33. The method according to the
invention is particularly of interest for the bacteria that are
pathogenic for humans. Group 4 comprises Gram negative bacteria. Genera
from Group 4 are Legionella (which can cause pneumonia and Legionnaires'
disease), the genus Neisseria (of which Neisseria meningitidis is well
known as causative agent of meningitis and of which Neisseria gonorrhoeae
is another example), the genus Pseudomonas (renowned for hospital
infections) and the genus Bordetella (of which Bordetella pertussis is
well known as causative agent of whooping cough). In Group 5 bacteria as
defined in Bergey's Manual the Enterobacteriaceae form a family of 30
genera. These bacteria form a particularly interesting group of Gram
negative bacteria that infect humans. Suitable examples of genera from
this family are Enterobacter, Escherichia, Shigella, Salmonella,
Serratia, Klebsiella and Yersinia. Other less well known pathogenic
Enterobacteriaceae genera are Cedecea, Citrobacter, Kluyvera, Leclercia,
Pantoea, Proteus, Providencia and Hafnia. Other Group 5 families are
Pasteurellaceae with the genus Haemophilus and the family Vibrionaceae
with the genus Vibrio. Haemophilus influenzae is a leading cause of
meningitis in children and also of other septicemic conditions. Vibrio
cholerae is the causative agent of cholera, V. parahaemolyticus can cause
food poisoning, and V. vulnificus causes highly fatal septicemia.

Of the Enterobacteriaceae, Shigella, Escherichia and Salmonella
are best known and difficult to differentiate. Shigella is an intestinal
pathogen of humans causing bacillary dysentery. Well known strains are S.
dysenteriae, S. flexneri, S. boydii and S. sonnei. The genus Salmonella
is well known for food poisoning. Well known Salmonella strains are S.
typhimurium, S. arizona, S. choleraesuis and S. bongori. Salmonella are
also causative agents of typhoid fever, enteric fevers, gastroenteritis
and septicemia. Bacteria of the genus Serratia are opportunistic
pathogens for hospitalized humans, causing septicemia and urinary tract
infections. Examples are S. liquefaciens and S. marcescens. Of the
Escherichia, E. coli is best known as a major cause of urinary tract
infections and nosocomial infections including septicemia and meningitis.
Other species are usually associated with wound infections.

Enterobacter constitutes a problem genus of opportunistic
pathogens causing burn wound and urinary tract infections, and
occasionally also meningitis and septicemia. Well known species are E.
cloacae, E. sakazakii, E. aerogenes, E. agglomerans and E. gergoviae.
Klebsiella are also causative agents of bacteremia, pneumonia, urinary
tract and other human infections in urological, neonatal, intensive care
and geriatric patients. Klebsiella pneumoniae and K. oxytoca are examples
of species in the genus.

Particularly interesting from a clinical point of view are also
the Gram positive pathogenic bacteria. The genera Streptococcus and
Staphylococcus form examples of such bacteria. Streptococcus pneumoniae,
Streptococcus pyogenes and Staphylococcus aureus are examples thereof. Of
the mentioned groups and genera the pathogenic bacteria are of interest.
These bacteria are dangerous in particular when they cause infections in
hospitals.

Due to the increasing incidence of infection, differentiation of
potentially epidemic organisms is also of interest. Such organisms
comprise Bordetella pertussis and Neisseria meningitidis; the latter, the
causative organism of meningitis, is of particular interest. Quite
specifically, pathogenic bacteria causing infections in hospitals and
bacteria capable of causing epidemics are targets for the differentiation
method according to the invention.

The invention consists of a method of in vitro amplification of
nucleic acid using amplification primers in a manner known per se, in
amplification reactions such as PCR, LCR or NASBA, wherein a pair of
primers is used comprising oligonucleotide sequences sufficiently
complementary to a part of the Direct Repeat sequence of a bacterium
other than a microorganism belonging to the M. tuberculosis complex of
microorganisms for hybridisation to a Direct Repeat to occur and
subsequently elongation of the hybridised primer to take place, said
primers being such that elongation in the amplification reaction occurs
for one primer in the 5' direction and for the other primer in the 3'
direction, wherein the Direct Repeat is a sequence with a length between
20-50 base pairs which occurs 5-60 times in a contiguous region of the
bacterial genome, whereby the Direct Repeat sequences are separated by
spacer sequences with a length of between 20-50 nucleotides, said spacer
sequences being non-repetitive. By using the programme Patscan, e.g. on
the nucleic acid data bases for microorganism genomic sequences, such
motifs and thus also the identities of the various species-specific
Direct Repeats and the corresponding spacer sequences can be obtained. In
the Patscan programme the Direct Repeat can be designated p1 with a
length between 20-50 base pairs; the search is then for p1 recurring
20-50 base pairs downstream of p1. Thus this pattern in Patscan is
described as p1=(20..50) (20..50)p1(20..50)p1. The length of the
sequences can be varied, as can the intermediate distance and the number
of times the Direct Repeat has to occur. A Direct Repeat can often have a
length of 30-40 base pairs with a spacer length of 35-45 base pairs.
Basically we looked for a stretch of identical repeat sequences
interspersed by spacer sequences, which do not necessarily share much of
their sequence with the Direct Repeat of M. tuberculosis. The Patscan
programme is freely accessible at the Internet site:
http://www-c.mcs.anl.gov/home/overbeek/PatScan/HTML/patscan.html. The
programme was written by Ross Overbeek, Mathematics and Computer Science
Division, Argonne National Laboratory, Building 221 Room D-236, 9700 S.
Cass Avenue, Argonne IL 60439 USA.
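
The kind of search expressed by the pattern above can be approximated in
a few lines of Python. This is a rough sketch of the idea only and not
the Patscan programme itself: it assumes exact repeat copies, reports
only the first cluster found and uses brute force. The function name and
all toy sequences are hypothetical.

```python
def find_repeat_cluster(seq, repeat_len=(20, 50), spacer_len=(20, 50), copies=3):
    """Return (repeat, start_positions) of the first repeat-spacer cluster.

    A cluster is `copies` exact occurrences of one repeat, each separated
    from the next by a spacer whose length lies within `spacer_len`.
    Returns None when no such cluster exists.
    """
    for start in range(len(seq)):
        # Try the longest candidate first so that a complete repeat is
        # preferred over one of its shorter prefixes.
        for rlen in range(repeat_len[1], repeat_len[0] - 1, -1):
            unit = seq[start:start + rlen]
            if len(unit) < rlen:
                continue
            positions = [start]
            while len(positions) < copies:
                after_unit = positions[-1] + rlen
                next_pos = next(
                    (after_unit + gap
                     for gap in range(spacer_len[0], spacer_len[1] + 1)
                     if seq[after_unit + gap:after_unit + gap + rlen] == unit),
                    None,
                )
                if next_pos is None:
                    break
                positions.append(next_pos)
            if len(positions) == copies:
                return unit, positions
    return None


# Hypothetical toy locus: flank, then three repeats with two spacers.
FLANK = "GGGCATATAT"
DR = "CTTTGACCGTAGGCATTCGGAAAC"
SP1 = "ATGCCGTTAGCATCGGATTACCGA"
SP2 = "TCCGATAGGCTTAACGTGCATGGTCA"
locus = FLANK + DR + SP1 + DR + SP2 + DR + "TTGACGT"
hit = find_repeat_cluster(locus)
print(hit)
```

As in the text, the repeat length, the spacer length and the required
number of copies are parameters that can be varied.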

Most of the Repeats exhibit one or more of the following
characteristics: they end with a sequence similar to GAAAC, i.e. exhibit
at least 3 of the nucleotides of this consensus sequence at the terminus,
preferably 4 or 5; they start with CTTTG; they have stretches of 3-4
identical bases. The termini can for example be selected from GAAAC,
GAAXXC, GAACTC, GXAAC, GCAAC, GAAA, GAAXC, GAAGC and AAAC. Suitable
termini are provided in Table II.
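
The terminus criterion can be illustrated with a short sketch. It
assumes a simple position-by-position comparison of the last five
nucleotides of a repeat with GAAAC, which does not cover the termini of
other lengths listed above; the function names and example sequences are
hypothetical.

```python
def terminus_matches(repeat, consensus="GAAAC"):
    """Count positions at which the repeat's 3' terminus equals the consensus."""
    tail = repeat[-len(consensus):]
    if len(tail) < len(consensus):
        return 0
    return sum(a == b for a, b in zip(tail, consensus))


def has_dr_like_terminus(repeat, min_matches=3):
    """True when at least `min_matches` terminal nucleotides match GAAAC."""
    return terminus_matches(repeat) >= min_matches


# Hypothetical example repeats.
print(terminus_matches("CTTTGACCGAAAC"))   # ends with GAAAC
print(terminus_matches("TTGCAAC"))         # ends with GCAAC
print(has_dr_like_terminus("ACGTTGGTCT"))  # ends with GGTCT
```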

Organisms as diverse as the Archaebacteria, e.g. Methanococcus
jannaschii (Group 31) and Haloferax mediterranei (Group 33), the
cyanobacteria Calothrix (Group 11) and Anabaena (Group 11), and purple
bacteria, e.g. E. coli (Group 5), Mycobacterium tuberculosis (Group 21)
and Thermus thermophilus (Group 4), Archaeoglobus (Group 32) and
Thermotoga (Group 6) were found to possess DR-like sequences upon
analysis of their genomes using the Patscan programme. In the subsequent
study of the literature from which these data were derived it also became
clear from Southern blots that the Repeat sequences are also found in
related species.

With regard to the genetic organisation, the structure of the
DR-like loci in the microorganisms is rather variable (figure 3). In M.
tuberculosis the DR locus is large and in most isolates it is disrupted
by an insertion element. This is also the case in T. thermophilus;
however, here the number of DVR's is only 11 and the DR locus is
disrupted by two insertion elements. In E. coli K12 two DR loci are
present, separated by approximately 22 kb; in Anabaena the locus is of
intermediate size and interrupted by a 130 bp sequence of unknown
function or origin. In H. mediterranei the DR locus is of intermediate
size and not disrupted; however, there is evidence for a second DR locus
on one of the mega plasmids found in this organism. In M. jannaschii
there is one locus of intermediate size, but at several other positions
in the genome one or a few other DVR's are found. In most cases the DVR's
are linked to a so-called Long Repeat (LR) element of unknown function.
Mega plasmids are also found in M. jannaschii, but in contrast to H.
mediterranei they do not contain DR sequences.

Accession numbers for the sequences of various organisms for
which the DR-like loci have been found are provided here. For E. coli and
Shigella: M27059, M27060, U29579, U29580 and M18270. The relevant
portions of the sequences are also disclosed by Blattner for E. coli.
Nakata et al. reveal in the Journal of Bacteriology (13) that downstream
of the iap region a sequence of 29 bases appears 14 times, 32 or 33 base
pairs apart. Nucleotide sequences hybridising to the 29 base pair
sequence were also detected in Shigella dysenteriae and Salmonella
typhimurium.

A DR-like sequence was found in contig 214 of S. pyogenes
M1 (ATCC 700294) of the genome sequencing project of the University of
Oklahoma. Further research into this DR-like sequence in other S.
pyogenes isolates revealed spacer polymorphism. The DR regions of eight
S. pyogenes isolates were studied. The DR regions were isolated by PCR
using primers that were derived from the database (University of
Oklahoma, serotype M1, ATCC 700294). The sequence data are available
under http://www.genome.ou.edu. This strain contains seven repeats and
six spacers.

Five of the isolates gave a PCR product: an M2 strain, an M4
strain and three M1 strains. The M4 strain contained only a single repeat
sequence, which was flanked by the same sequences as in ATCC 700294.
Sequencing of the M2 strain did not work, but the size of the PCR
fragment indicated that two repeats are present. The three M1 strains
were all the same; they contained four repeats and three spacers. The
repeats were identical to those of ATCC 700294, while one of the spacers
was identical to a spacer of ATCC 700294 and two were different.

These studies on S. pyogenes show that the DR regions have
conserved spacers and repeat sequences.

The Salmonella genomic sequence as sequenced by Washington
University, St. Louis has also revealed the presence of DR-like
sequences. The DR exhibits high homology with the Direct Repeat of E.
coli. One of the contigs revealed 7 Repeats and 6 spacers.

A panel of five E. coli isolates and three Shigella strains
was studied. The five E. coli isolates were selected to have an optimal
diversity; they were isolated from different species or geographic
regions. The Shigella strains are considered separate (sub)species. See
Table 1. The isolates were obtained from the collection of Dr. Wim
Gaastra.



Table 1

species                 description           DRI*       DRII*
E. coli 184             American isolate      Southern   PCR
        358             human urinary tract   Southern   Southern
        968             mastitis              Southern   PCR
        1008            chicken               PCR        PCR
        1732            human intestine       Southern   PCR
Shigella dysenteriae    593                   Southern   PCR
         sonnei         595                   Southern   PCR
         boydii         603                   Southern   PCR

* The DR regions were identified by Southern blot of genomic DNA and DRI
and DRII regions of E. coli K12. When PCR is indicated the DR regions
were identified by the Southern and the PCR. This PCR was done with
primers derived from the K12 sequence.



The DRI and DRII sequences that could be amplified by PCR were
cloned and sequenced. For reasons not established, the DRI regions could
not be amplified by PCR using the primers designed on the K12 sequence,
while the Southern data demonstrate that DRI is present. Apparently, the
recognition sites for the primers are polymorphic. The sizes of the DRII
regions were found to vary greatly between these isolates. The smallest
was a single repeat in the S. sonnei strain and the largest was a repeat
cluster of at least 15 repeats in E. coli isolate 1008. The sequences of
the repeats were highly conserved between these isolates. The S.
typhimurium data are obtainable from the Internet at
http://genome.wustl.edu/gsc/bacterial/salmonella.html.

Almost all of the spacer sequences were unique. Approximately 40
spacers have been sequenced and only three of them were already known
from a previously sequenced DR region. This indicates a high number of
different spacer sequences in E. coli.

Accession number X73453 provides the Haloferax mediterranei
sequence. The sequence can also be found in Molecular Microbiology 17 of
1995 in an article by Mojica et al. (17). The Repeat sequence has also
been found in related species.

The genome project of Methanococcus jannaschii reveals a
DR-like sequence, as is apparent from the article by Bult et al. in
Science 273 of 1996 (18). The accession number is, inter alia, U67459.

Accession number X87270 for Anabaena sp. reveals 17 spacers and
an LTRR element. These elements also occur in related species of
cyanobacteria such as Calothrix. The sequence data are provided by
Masepohl et al. in BBA 1307, 1996 (23).

Accession number AE000782 for Archaeoglobus fulgidus reveals
three DR-like Repeats, two with the same Repeat sequence and the third
with a slightly larger but closely related Repeat. The Repeats are
present 20-30 times. The spacers are unique sequences. H.P. Klenk
discloses sequence data in Nature 390, 1997 (24).

The invention also covers a method of detection of a bacterium,
said bacterium not belonging to the M. tuberculosis complex of
microorganisms, said method comprising

1) amplifying nucleic acid from a sample with the amplification method
according to any of the preceding described embodiments of the
amplification method according to the invention, followed by

2) carrying out a hybridisation test in a manner known per se, wherein
the amplification product is hybridised to an oligonucleotide probe
or a plurality of different oligonucleotide probes, each
oligonucleotide being sufficiently homologous to a part of a spacer
of the Direct Repeat region of the bacterium to be determined for
hybridisation to occur to amplified product if such spacer nucleic
acid was present in the sample prior to amplification, said
hybridisation step optionally being carried out without prior
electrophoresis or separation of the amplified product,

3) detecting any hybridised products in a manner known per se.

The method can be carried out in a manner such that the
hybridisation test is carried out using a number of oligonucleotide
probes, said number comprising at least a number of oligonucleotide
probes specific for the total spectrum of bacteria it is desired to
detect. In a suitable embodiment of a method according to the invention
the oligonucleotide probe is at least seven nucleotides long and is
a sequence complementary to a sequence selected from any of the spacer
sequences of the Direct Repeat region of the bacterium to be determined
or is a sequence complementary to fragments or derivatives of said spacer
sequences, said oligonucleotide probe being capable of hybridising to
such a spacer sequence and comprising at least seven consecutive
nucleotides homologous to such a spacer sequence and/or exhibiting at
least 60% homology, preferably exhibiting at least 80% homology with such
a spacer sequence.
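
The two homology criteria named above, a run of at least seven
consecutive matching nucleotides and/or at least 60% homology, can be
checked in silico with a sketch such as the following. It is an
illustration under assumptions: both sequences are written on the same
strand (the probe itself would be the complement), homology is taken as
the best ungapped percentage identity, and the "and/or" is implemented
as a logical or. The function names and sequences are hypothetical.

```python
from difflib import SequenceMatcher


def longest_shared_run(a, b):
    """Length of the longest stretch of consecutive nucleotides common to a and b."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return matcher.find_longest_match(0, len(a), 0, len(b)).size


def best_ungapped_identity(probe_target, spacer):
    """Highest fraction of matching positions over all ungapped placements."""
    if not probe_target or len(probe_target) > len(spacer):
        return 0.0
    best = 0
    for offset in range(len(spacer) - len(probe_target) + 1):
        window = spacer[offset:offset + len(probe_target)]
        best = max(best, sum(p == s for p, s in zip(probe_target, window)))
    return best / len(probe_target)


def is_sufficiently_homologous(probe_target, spacer, min_run=7, min_identity=0.6):
    """Apply the run-length criterion and/or the percentage criterion."""
    return (longest_shared_run(probe_target, spacer) >= min_run
            or best_ungapped_identity(probe_target, spacer) >= min_identity)


# Hypothetical spacer and probe target sequences.
SPACER = "ATGCCGTTAGCATCGGATTACCGA"
print(is_sufficiently_homologous(SPACER[5:17], SPACER))
print(is_sufficiently_homologous("AAAAAAAAAAAA", SPACER))
```

Whether a probe meeting these numerical criteria actually hybridises
depends on the hybridisation conditions, which this sketch does not
model.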

Preferably the method according to the invention is carried out
to determine the presence and nature of a pathogenic bacterium selected
from the group of Gram negative bacteria of Groups 4 and 5 of Bergey's
Manual of Determinative Bacteriology, Ninth edition. Of particular
interest due to the damage caused by such pathogens are bacteria
belonging to the families Enterobacteriaceae, Pasteurellaceae and
Vibrionaceae of Group 5, most specifically the Enterobacteriaceae. Also
of interest are the Gram positive bacteria of Group 17. Suitable examples
of genera of the pathogenic bacterium to be detected from the group of
Gram negative bacteria of Bergey's Manual of Determinative Bacteriology,
Ninth edition are Escherichia, Shigella, Salmonella, Klebsiella,
Enterobacter, Yersinia, Serratia, Haemophilus, Vibrio, Legionella,
Neisseria, Pseudomonas and Bordetella. Of the Gram positive bacterial
genera, Staphylococcus and Streptococcus are targets for the
differentiation method.

Suitably in a method according to the invention for
differentiating the type of bacterium in a sample, said bacterium not
belonging to the M. tuberculosis complex, the hybridisation pattern is
compared with that obtained with a reference. Such a reference can be the
hybridisation pattern obtained with one or more known strains of the
bacterium to be determined, in analogous manner as the strain to be
determined. Alternatively the reference is a source providing a list of
spacer sequences and sources thereof, such as a data bank. Table II
exhibits some suitable examples of sequences that occur as Direct Repeat
sequences according to the invention for the genera illustrated.
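
The comparison of a hybridisation pattern with reference patterns can be
sketched as a lookup of presence/absence vectors. This is a minimal
illustration: the probe panel, the spacer labels and the reference
strain names are hypothetical and do not correspond to Table II or to
any data bank.

```python
def hybridisation_pattern(sample_spacers, probe_panel):
    """Presence (True) or absence (False) of each panel spacer in the sample."""
    return tuple(probe in sample_spacers for probe in probe_panel)


def matching_references(pattern, references):
    """Names of the reference strains whose pattern equals the sample pattern."""
    return sorted(name for name, ref in references.items() if ref == pattern)


# Hypothetical panel of five spacer probes and three reference strains.
PANEL = ["sp1", "sp2", "sp3", "sp4", "sp5"]
REFERENCES = {
    "strain A": (True, True, False, True, False),
    "strain B": (True, False, False, True, True),
    "strain C": (True, True, False, True, False),
}
sample = hybridisation_pattern({"sp1", "sp2", "sp4"}, PANEL)
print(sample)
print(matching_references(sample, REFERENCES))
```

A sample whose pattern matches no reference would be reported as an
empty list and could then be added to the reference collection.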

Not only the above methods fall within the scope of the
invention but also specifically selected primer pairs for carrying out
such a method. A pair of primers according to the invention is a pair
wherein both primers comprise oligonucleotide sequences of at least 7
nucleotides and are sufficiently complementary to a part of the
Direct Repeat sequence of the microorganism E. coli for hybridisation to
occur and subsequently elongation of the hybridised primer to take place,
said primers being such that elongation in the amplification reaction
occurs for one primer in the 5' direction and for the other primer in the
3' direction, and wherein sufficiently complementary means said
oligonucleotide sequence comprises at least seven consecutive nucleotides
homologous to such a Direct Repeat sequence and/or exhibits at least 60%
homology, preferably at least 80% homology, most preferably more than 90%
homology with the corresponding part of the Direct Repeat sequence.
Suitable Direct Repeat sequences are provided in Table II. In particular
such a primer pair can comprise one primer DRa capable of elongation in
the 5' direction and the other primer DRb capable of elongation in the 3'
direction, with DRa being complementary to a sequence of the Direct
Repeat located to the 5' side of the sequence of the Direct Repeat to
which DRb is complementary, the Direct Repeat being present in the Direct
Repeat region of E. coli. Another suitable pair comprises primers with
oligonucleotide sequences of at least 7 nucleotides which are
sufficiently complementary to a part of the Direct Repeat sequence of the
microorganism S. typhimurium for hybridisation to occur and subsequently
elongation of the hybridised primer to take place, said primers being
such that elongation in the amplification reaction occurs for one primer
in the 5' direction and for the other primer in the 3' direction, and
wherein sufficiently complementary means said oligonucleotide sequence
comprises at least seven consecutive nucleotides homologous to such a
Direct Repeat sequence, in particular the sequence provided in Table II,
and/or exhibits at least 60% homology, preferably at least 80% homology,
most preferably more than 90% homology with the corresponding part of the
Direct Repeat sequence. In particular such a pair comprises one primer
DRa capable of elongation in the 5' direction and the other primer DRb
capable of elongation in the 3' direction, with DRa being complementary
to a sequence of the Direct Repeat located to the 5' side of the sequence
of the Direct Repeat to which DRb is complementary, the Direct Repeat
being present in the Direct Repeat region of S. typhimurium.

Kits for carrying out a differentiation method according to any 
of the described embodiments also fall within the scope of the invention. 
Such kits comprise a primer pair according to any of the described 
embodiments and optionally an oligonucleotide probe or a carrier, said 
carrier comprising at least 1 oligonucleotide probe specific for a spacer 
region of a bacterium to be determined, said bacterium not belonging to 
the M. tuberculosis complex, preferably the oligonucleotide probe as 
defined, being an oligonucleotide probe of at least 10 nucleotides, 
preferably more than 12 nucleotides, in particular comprising between 12 
and 40 nucleotides, said probe being sufficiently homologous to any of the 
spacer sequences or to fragments or derivatives of such spacer sequences 
to hybridise to such a spacer sequence, said oligonucleotide probe 
comprising at least 10 consecutive nucleotides homologous to such a 
spacer sequence and/or exhibiting at least 60% homology, preferably 
exhibiting at least 80% homology, most preferably exhibiting more than 
90% homology with the corresponding part of the spacer sequence. Suitably 
a kit according to the invention comprises a data carrier with required 
reference patterns of the bacterial strain to be determined. 
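
Purely as an illustration of how reference patterns held on such a data carrier might be used, an observed hybridisation pattern can be encoded as a string of 1s and 0s (spacer probe positive or negative) and compared with stored reference patterns. The sketch below is an assumption about one possible encoding and comparison (position-by-position mismatch counting); the strain names and patterns are invented and do not come from the specification.

```python
def pattern_distance(a: str, b: str) -> int:
    """Number of spacer probes whose result differs between two patterns."""
    if len(a) != len(b):
        raise ValueError("patterns must cover the same set of spacer probes")
    return sum(x != y for x, y in zip(a, b))


def closest_references(observed: str, references: dict[str, str]):
    """Return (smallest distance, sorted names of the reference patterns at that distance)."""
    distances = {name: pattern_distance(observed, pattern)
                 for name, pattern in references.items()}
    best = min(distances.values())
    return best, sorted(name for name, d in distances.items() if d == best)


# Hypothetical reference patterns for six spacer probes (1 = hybridisation signal).
REFERENCES = {
    "strain_A": "110110",
    "strain_B": "100111",
    "strain_C": "011010",
}

print(closest_references("100111", REFERENCES))
```

A distance of zero indicates an exact match with a stored reference pattern; ties are returned together so that an ambiguous result is visible rather than silently resolved.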



DESCRIPTION OF THE FIGURES 

Figure 1 depicts the structure of the DR region of M. bovis BCG as 
determined previously by Hermans et al. and Groenen et al. (12, 15). For 
the sake of convenience we will designate a DR plus its 3'-adjacent spacer 
sequence as a "Direct Variant Repeat" (DVR). Thus, the DR region is 
composed of a discrete number of DVRs, each consisting of a constant 
part (DR) and a variable part (the spacer). 
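
The DVR organisation described for Figure 1 can be pictured with a short sketch that splits a region into (DR, spacer) pairs. It is illustrative only and assumes that every DR copy is an exact repeat of one known sequence, which need not hold for real loci; the function name and the toy sequences are hypothetical.

```python
def split_into_dvrs(region: str, dr: str) -> list[tuple[str, str]]:
    """Split a DR region into DVRs, each a (DR, 3'-adjacent spacer) pair.

    Assumption: every DR copy matches `dr` exactly. The sequence following
    the last DR is returned as its 'spacer', although in a real locus it
    would be flanking sequence.
    """
    dvrs = []
    pos = region.find(dr)
    while pos != -1:
        spacer_start = pos + len(dr)
        next_pos = region.find(dr, spacer_start)
        spacer_end = next_pos if next_pos != -1 else len(region)
        dvrs.append((dr, region[spacer_start:spacer_end]))
        pos = next_pos
    return dvrs


# Toy sequences for illustration; not actual DR or spacer sequences.
TOY_DR = "GGATCC"
TOY_REGION = TOY_DR + "AAAT" + TOY_DR + "CTTGA" + TOY_DR + "TTC"

for repeat, spacer in split_into_dvrs(TOY_REGION, TOY_DR):
    print(repeat, spacer)
```

Each tuple pairs the constant part (the DR) with the variable part (the spacer) that follows it.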

Figure 2 depicts multiple, short synthetic oligomeric DNA sequences based 
on the sequences of the unique spacer DNAs within the DR region. 

Figure 3 shows the genetic organisation of the structures of the DR-like 
loci in various bacterial species. 
[symbol] depicts the transcription direction of an open reading frame (ORF). 

For M. tuberculosis: MTCY 16B7.26, 27 and 30C are unknown genes/proteins. 

For E. coli: the iap gene function is alkaline phosphatase isozyme 
conversion. ORFs f94, f305, YGCE and f223 are unknown genes/proteins. 

For S. pyogenes: ORF1 and 2 are unknown genes/proteins. 

For T. thermophilus: ORFC and D are unknown genes/proteins and ORF 1A and 
1B are possibly transposases of IS elements 1000 and 1000A. 

For Anabaena: No ORFs were annotated in the flanking sequences. The 130 
bp insert is of unknown origin. 

For Haloferax mediterranei: ORF21 is an unknown gene/protein. Probably 
another repeat cluster is also present on the megaplasmid pHM500. 

For Methanococcus jannaschii: Comprises about 10 repeat clusters, the 
largest one of which comprises 25 repeats. All repeat clusters are 
coupled to a Long Repeat (LR) segment of 425 bp. There are 18 LRs, some 
of which contain only one repeat. Smaller LR segments are also present, 
ALR. In one case, a cluster contains 5 repeats without LR (see ref. 18). 

For M. thermoautotrophicum: Two repeat clusters, SRI and SRII, flanked by 
LRI, LRII. LRI and LRII are almost identical and are homologues of the LR 
segment of M. jannaschii. SRI and SRII are separated by 500 kb in the 
genome. 

For Thermotoga maritima: the CelA gene encodes a cellulase, 
endo-1,4-beta-glucanase (EC 3.2.1.4), and CelB is also a cellulase 
exhibiting 58% identity with CelA. 

For Archaeoglobus fulgidus: The SRIA and SRIB repeat clusters have the 
same Repeat Sequence and the SRII Repeat Sequence is also clearly 
homologous. The SR clusters are separated by about 400 bp. SRIB and SRII 
are located near tRNA genes. SRIA lies adjacent to an unknown ORF3. 

Figure 4 

Hybridization patterns of 17 E. coli isolates. Thirty-four different 
spacer oligonucleotides were covalently linked to a membrane and 
PCR-amplified DNA of E. coli was hybridized as described (Kamerbeek et al. 
1997), except that the primers used to amplify the DR locus were specific 
for the DR sequence from E. coli. Note the polymorphism observed in E. 
coli due to the strain-dependent presence or absence of spacer DNA. 

Figure 5 

Hybridization patterns of 4 Salmonella typhimurium isolates. Six 
different spacer oligonucleotides were covalently linked to a membrane 
and PCR-amplified Salmonella DNA was hybridized as described (Kamerbeek 
et al. 1997), except that the primers used to amplify the DR locus were 
specific for the DR locus of E. coli. Note the polymorphism observed in 
Salmonella due to the strain-dependent presence or absence of spacer DNA. 



